
Collaborating Authors: Yucatán



Generative AI for Self-Adaptive Systems: State of the Art and Research Roadmap

Li, Jialong, Zhang, Mingyue, Li, Nianyu, Weyns, Danny, Jin, Zhi, Tei, Kenji

arXiv.org Artificial Intelligence

Self-adaptive systems (SASs) are designed to handle changes and uncertainties through a feedback loop with four core functionalities: monitoring, analyzing, planning, and execution. Recently, generative artificial intelligence (GenAI), especially large language models, has shown impressive performance in data comprehension and logical reasoning. These capabilities align closely with the functionalities required in SASs, suggesting strong potential for employing GenAI to enhance SASs. However, the specific benefits and challenges of employing GenAI in SASs remain unclear, and providing a comprehensive understanding of them is complex for several reasons: limited publications in the SAS field, the technological and application diversity within SASs, and the rapid evolution of GenAI technologies. To that end, this paper aims to provide researchers and practitioners with a comprehensive snapshot of the potential benefits and challenges of employing GenAI within SASs. Specifically, we gather, filter, and analyze literature from four distinct research fields and organize it into two main categories of potential benefits: (i) enhancements to the autonomy of SASs centered around the specific functions of the MAPE-K feedback loop, and (ii) improvements in the interaction between humans and SASs within human-on-the-loop settings. From our study, we outline a research roadmap that highlights the challenges of integrating GenAI into SASs. The roadmap starts by outlining key research challenges that need to be tackled to exploit the potential of applying GenAI in the field of SASs, and concludes with a practical reflection that elaborates on current shortcomings of GenAI and proposes possible mitigation strategies.
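The MAPE-K feedback loop named in the abstract can be sketched as a minimal control loop. The class, the latency goal, and the scaling action below are illustrative assumptions for exposition, not part of the paper:

```python
# Minimal sketch of a MAPE-K feedback loop. The knowledge base ("K") holds
# the adaptation goal; monitor/analyze/plan/execute are the four functions.
# All names and the toy latency model are hypothetical.

class MAPEKLoop:
    def __init__(self, knowledge):
        self.knowledge = knowledge  # shared knowledge base (the "K")

    def monitor(self, system):
        # Collect raw observations from the managed system.
        return {"latency_ms": system["latency_ms"]}

    def analyze(self, observations):
        # Check the observations against the goal in the knowledge base.
        return observations["latency_ms"] > self.knowledge["max_latency_ms"]

    def plan(self, violation):
        # Produce an adaptation plan; here, a single scale-out action.
        return [("scale_out", 1)] if violation else []

    def execute(self, system, plan):
        # Apply the plan to the managed system.
        for action, amount in plan:
            if action == "scale_out":
                system["replicas"] += amount
                system["latency_ms"] /= 2  # toy model of the effect
        return system

    def step(self, system):
        obs = self.monitor(system)
        return self.execute(system, self.plan(self.analyze(obs)))


loop = MAPEKLoop(knowledge={"max_latency_ms": 100})
state = {"latency_ms": 240, "replicas": 1}
state = loop.step(state)  # latency 240 -> 120, replicas 1 -> 2
state = loop.step(state)  # latency 120 -> 60, replicas 2 -> 3
```

The paper's point is that GenAI could strengthen each of the four functions, e.g. an LLM interpreting unstructured logs in `analyze` or generating candidate plans in `plan`.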


A Concise Review of Hallucinations in LLMs and their Mitigation

Pulkundwar, Parth, Dhanawade, Vivek, Yadav, Rohit, Sonkar, Minal, Asurlekar, Medha, Rathod, Sarita

arXiv.org Artificial Intelligence

Traditional language models face a challenge from hallucinations: their very presence casts a long, dangerous shadow over the promising realm of natural language processing. It is therefore crucial to understand the kinds of hallucinations that occur today, their origins, and ways of reducing them. This document provides a concise and straightforward summary of exactly that, serving as a one-stop resource for a general understanding of hallucinations and how to mitigate them. In today's fast-moving world of Natural Language Processing (NLP), large language models (LLMs) such as GPT, BERT, and others have become the principal agents of change: they can generate human-like text, answer multifaceted questions, and engage in conversation with human-like fluency.


A Probabilistic Framework for Temporal Distribution Generalization in Industry-Scale Recommender Systems

Zhu, Yuxuan, Fu, Cong, Ni, Yabo, Zeng, Anxiang, Fang, Yuan

arXiv.org Artificial Intelligence

Temporal distribution shift (TDS) erodes the long-term accuracy of recommender systems, yet industrial practice still relies on periodic incremental training, which struggles to capture both stable and transient patterns. Existing approaches such as invariant learning and self-supervised learning offer partial solutions but often suffer from unstable temporal generalization, representation collapse, or inefficient data utilization. To address these limitations, we propose ELBO$_\text{TDS}$, a probabilistic framework that integrates seamlessly into industry-scale incremental learning pipelines. First, we identify key shifting factors through statistical analysis of real-world production data and design a simple yet effective data augmentation strategy that resamples these time-varying factors to extend the training support. Second, to harness the benefits of this extended distribution while preventing representation collapse, we model the temporal recommendation scenario using a causal graph and derive a self-supervised variational objective, ELBO$_\text{TDS}$, grounded in the causal structure. Extensive experiments supported by both theoretical and empirical analysis demonstrate that our method achieves superior temporal generalization, yielding a 2.33% uplift in GMV per user, and has been successfully deployed in Shopee Product Search. Code is available at https://github.com/FuCongResearchSquad/ELBO4TDS.
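The resampling-style augmentation in the abstract can be illustrated with a toy sketch: extend the training support by redrawing a time-varying factor from values observed across the whole history. The feature names and the choice of "popularity" as the shifting factor are assumptions for illustration, not the paper's actual design:

```python
# Toy sketch of extending training support by resampling a time-varying
# factor. All field names are hypothetical.
import random

def augment(samples, factor_key="popularity", n_aug=2, seed=0):
    """For each sample, create n_aug copies with the time-varying factor
    redrawn from the values observed across the full history."""
    rng = random.Random(seed)
    support = [s[factor_key] for s in samples]  # empirical support of the factor
    augmented = list(samples)                   # keep the originals
    for s in samples:
        for _ in range(n_aug):
            copy = dict(s)
            copy[factor_key] = rng.choice(support)
            augmented.append(copy)
    return augmented

history = [
    {"user": 1, "item": "a", "popularity": 0.9, "label": 1},
    {"user": 2, "item": "b", "popularity": 0.1, "label": 0},
]
extended = augment(history)  # 2 originals + 2 copies each = 6 samples
```

The paper then trains on this widened distribution under a variational objective to avoid the representation collapse that naive augmentation can cause.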


Inside the world's longest underwater cave: Subterranean water 'web' in Mexico extends at least 325 MILES

Daily Mail - Science & tech

Beneath the idyllic resort towns of Mexico's Yucatán Peninsula, daring explorers have uncovered a hidden world of grand chambers and twisting tunnels.
The Ox Bel Ha, Mayan for 'Three Paths of Water', is a sprawling water 'web' that makes up the world's longest underwater cave system.


83b7da3ed13f06c13ce82235c8eedf35-Paper-Conference.pdf

Neural Information Processing Systems

Despite the remarkable capabilities demonstrated by Graph Neural Networks (GNNs) in graph-related tasks, recent research has revealed fairness vulnerabilities in GNNs when facing malicious adversarial attacks. However, all existing fairness attacks require manipulating the connectivity between existing nodes, which may be prohibited in reality. To this end, we introduce a Node Injection-based Fairness Attack (NIFA), exploring the vulnerabilities of GNN fairness in this more realistic setting. In detail, NIFA first designs two insightful principles for node injection operations, namely the uncertainty-maximization principle and the homophily-increase principle, and then optimizes the injected nodes' feature matrix to further ensure the effectiveness of the fairness attack. Comprehensive experiments on three real-world datasets consistently demonstrate that NIFA can significantly undermine the fairness of mainstream GNNs, even including fairness-aware GNNs, by injecting merely 1% of nodes. We sincerely hope that our work can stimulate increasing attention from researchers on the vulnerability of GNN fairness, and encourage the development of corresponding defense mechanisms.
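The uncertainty-maximization principle can be illustrated in miniature: rank existing nodes by the entropy of their current class predictions and attach injected nodes to the most uncertain ones. This is a simplified sketch under assumed toy probabilities, not the paper's full NIFA procedure:

```python
# Toy illustration of uncertainty-maximization for node injection:
# attach adversarial nodes where the victim model is least confident.
import math

def entropy(probs):
    """Shannon entropy (natural log) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_targets(node_probs, budget):
    """Rank nodes by prediction entropy and return the `budget` most
    uncertain ones as attachment points for injected nodes."""
    ranked = sorted(node_probs, key=lambda n: entropy(node_probs[n]), reverse=True)
    return ranked[:budget]

node_probs = {
    "v1": [0.99, 0.01],  # confident prediction
    "v2": [0.55, 0.45],  # uncertain
    "v3": [0.50, 0.50],  # maximally uncertain
}
targets = pick_targets(node_probs, budget=2)  # -> ["v3", "v2"]
```

In the paper this target selection is combined with the homophily-increase principle and an optimization of the injected nodes' features.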


Towards Ecologically Valid LLM Benchmarks: Understanding and Designing Domain-Centered Evaluations for Journalism Practitioners

Li, Charlotte, Hagar, Nick, Nishal, Sachita, Gilbert, Jeremy, Diakopoulos, Nick

arXiv.org Artificial Intelligence

Benchmarks play a significant role in how researchers and the public understand generative AI systems. However, the widespread use of benchmark scores to communicate about model capabilities has led to criticisms of validity, especially whether benchmarks test what they claim to test (i.e. construct validity) and whether benchmark evaluations are representative of how models are used in the wild (i.e. ecological validity). In this work we explore how to create an LLM benchmark that addresses these issues by taking a human-centered approach. We focus on designing a domain-oriented benchmark for journalism practitioners, drawing on insights from a workshop of 23 journalism professionals. Our workshop findings surface specific challenges that inform benchmark design opportunities, which we instantiate in a case study that addresses underlying criticisms and specific domain concerns. Through our findings and design case study, this work provides design guidance for developing benchmarks that are better tuned to specific domains.


Mystery Mayan ruler was no king

Popular Science

Ix Ch'ak Ch'een was one of at least four women who oversaw the city of Cobá. Breakthroughs, discoveries, and DIY tips sent every weekday. Ongoing analysis of an ancient monument among the Mayan ruins at Cobá has revealed the identity of one of the sprawling city's previously unknown rulers. According to archaeologists with Mexico's National Institute of Anthropology and History (INAH), the king referenced multiple times in the historical accounts described on the city's Foundation Rock wasn't a king at all. She was a queen named Ix Ch'ak Ch'een.


LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought

Yan, Cheng, Mohr, Felix, Viering, Tom

arXiv.org Artificial Intelligence

Sample-wise learning curves plot performance versus training set size. They are useful for studying scaling laws and speeding up hyperparameter tuning and model selection. Learning curves are often assumed to be well-behaved: monotone (i.e. improving with more data) and convex. By constructing the Learning Curves Database 1.1 (LCDB 1.1), a large-scale database with high-resolution learning curves including more modern learners (CatBoost, TabNet, RealMLP and TabPFN), we show that learning curves are less often well-behaved than previously thought. Using statistically rigorous methods, we observe significant ill-behavior in approximately 15% of the learning curves, almost twice as much as in previous estimates. We also identify which learners are to blame and show that specific learners are more ill-behaved than others. Additionally, we demonstrate that different feature scalings rarely resolve ill-behavior. We evaluate the impact of ill-behavior on downstream tasks, such as learning curve fitting and model selection, and find it poses significant challenges, underscoring the relevance and potential of LCDB 1.1 as a challenging benchmark for future research.
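The two "well-behaved" properties the abstract tests for, monotonicity and convexity, are easy to state on a discrete error curve. The error values below are invented for illustration; LCDB 1.1's statistical tests are far more rigorous than these point-wise checks:

```python
# Point-wise checks of learning-curve well-behavedness: error should be
# monotone non-increasing and convex (diminishing returns) in sample size.

def is_monotone(errors):
    """Error never increases as training set size grows."""
    return all(b <= a for a, b in zip(errors, errors[1:]))

def is_convex(errors):
    """Successive improvements shrink: second differences are non-negative."""
    diffs = [b - a for a, b in zip(errors, errors[1:])]
    return all(d2 >= d1 for d1, d2 in zip(diffs, diffs[1:]))

well_behaved = [0.40, 0.25, 0.18, 0.15, 0.14]
peaking      = [0.40, 0.25, 0.30, 0.20, 0.18]  # dip-and-rise: ill-behaved

assert is_monotone(well_behaved) and is_convex(well_behaved)
assert not is_monotone(peaking)
```

Ill-behavior in this sense, observed in roughly 15% of LCDB 1.1's curves, is exactly what breaks monotone parametric curve fitting and early-stopping-style model selection.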


ChoirRec: Semantic User Grouping via LLMs for Conversion Rate Prediction of Low-Activity Users

Zhai, Dakai, Gao, Jiong, Du, Boya, Xu, Junwei, Shen, Qijie, Zhu, Jialin, Jiang, Yuning

arXiv.org Artificial Intelligence

Accurately predicting conversion rates (CVR) for low-activity users remains a fundamental challenge in large-scale e-commerce recommender systems. Existing approaches face three critical limitations: (i) reliance on noisy and unreliable behavioral signals; (ii) insufficient user-level information due to the lack of diverse interaction data; and (iii) a systemic training bias toward high-activity users that overshadows the needs of low-activity users. To address these challenges, we propose ChoirRec, a novel framework that leverages the semantic capabilities of Large Language Models (LLMs) to construct semantic user groups and enhance CVR prediction for low-activity users. With a dual-channel architecture designed for robust cross-user knowledge transfer, ChoirRec comprises three components: (i) a Semantic Group Generation module that utilizes LLMs to form reliable, cross-activity user clusters, thereby filtering out noisy signals; (ii) a Group-aware Hierarchical Representation module that enriches sparse user embeddings with informative group-level priors to mitigate data insufficiency; and (iii) a Group-aware Multi-granularity Module that employs a dual-channel architecture and an adaptive fusion mechanism to ensure effective learning and utilization of group knowledge. We conduct extensive offline and online experiments on Taobao, a leading industrial-scale e-commerce platform. ChoirRec improves GAUC by 1.16% in offline evaluations, while online A/B testing reveals a 7.24% increase in order volume, highlighting its substantial practical value in real-world applications.
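The group-level-prior idea can be sketched in a few lines: a low-activity user's sparse embedding is blended with the mean embedding of their LLM-assigned semantic group. The fixed blending weight and the mean-pooling prior are illustrative assumptions, not ChoirRec's actual adaptive fusion mechanism:

```python
# Toy sketch of enriching a sparse user embedding with a group-level prior.
# Group membership is assumed to come from an LLM-based clustering step.

def group_mean(embeddings):
    """Mean-pool member embeddings into a group prior."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]

def enrich(user_emb, group_embs, alpha):
    """Blend a user embedding with its group prior; a larger alpha leans
    more on the group (useful when the user has little history)."""
    prior = group_mean(group_embs)
    return [(1 - alpha) * u + alpha * p for u, p in zip(user_emb, prior)]

group = [[1.0, 0.0], [0.0, 1.0]]  # embeddings of (active) group members
sparse_user = [0.2, 0.2]          # low-activity user with weak signal
enriched = enrich(sparse_user, group, alpha=0.5)  # ≈ [0.35, 0.35]
```

In the paper the weight on the group channel is learned adaptively rather than fixed, so the model leans on group knowledge only when a user's own signal is insufficient.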